Mistral Medium 3 is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge, coding and vision capabilities.
An enhanced version of Mistral Small 3, adding multimodal capabilities and a 128k context length.
Ministral 3B is a state-of-the-art Small Language Model (SLM) optimized for edge computing and on-device applications. As it is designed for low-latency and compute-efficient inference, it is also a strong fit for standard GenAI applications with real-time and high-volume requirements.
Codestral 25.01 by Mistral AI is designed for code generation, supporting 80+ programming languages, and optimized for tasks like code completion and fill-in-the-middle (FIM).
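For illustration, a minimal fill-in-the-middle sketch using Mistral's `mistralai` Python SDK; the `codestral-latest` model alias, the snippet contents, and the environment-variable name are assumptions, not a definitive usage guide.

```python
import os
from mistralai import Mistral

# Minimal FIM sketch, assuming the mistralai v1 SDK and the
# "codestral-latest" alias; adjust names to your setup.
client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.fim.complete(
    model="codestral-latest",
    prompt="def fibonacci(n: int) -> int:\n",  # code before the gap
    suffix="\n\nprint(fibonacci(10))",          # code after the gap
)

# The model fills in the body between prompt and suffix.
print(response.choices[0].message.content)
```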
The Mistral-7B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.1 generative text model, trained on a variety of publicly available conversation datasets. For full details of this model, please read the release blog post.
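As a rough sketch of how instruct-style inference with this model typically looks via Hugging Face `transformers` (the dtype and generation settings below are assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# The chat template wraps the message in the instruct format the
# model was fine-tuned on.
messages = [{"role": "user", "content": "Summarize RoPE in one sentence."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```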
Mistral Small can be used for any language-based task that requires high efficiency and low latency.
Converts documents to Markdown with interleaved images and text.
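A hedged sketch of what such a conversion call might look like with the `mistralai` SDK's OCR endpoint, assuming the `mistral-ocr-latest` model alias and a placeholder document URL:

```python
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Assumed OCR call shape; the document URL is a placeholder.
response = client.ocr.process(
    model="mistral-ocr-latest",
    document={"type": "document_url", "document_url": "https://example.com/report.pdf"},
    include_image_base64=True,  # return embedded images alongside the text
)

# Each page comes back as Markdown with image references interleaved.
markdown = "\n\n".join(page.markdown for page in response.pages)
print(markdown)
```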
The Mixtral-8x7B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. Mixtral-8x7B outperforms Llama 2 70B on most benchmarks with 6x faster inference. Mixtral-8x7B-v0.1 is a decoder-only model with 8 distinct groups, the "experts". At every layer, for every token, a router network chooses two of these experts to process the token and combines their outputs additively.
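To make the routing concrete, here is an illustrative top-2 router in Python; the layer sizes and `Linear` experts are toy assumptions, not the actual Mixtral implementation:

```python
import torch
import torch.nn.functional as F

# Toy top-2 expert routing: for every token, a router scores 8 experts,
# keeps the best two, and combines their outputs additively.
num_experts, top_k, d_model = 8, 2, 32
experts = [torch.nn.Linear(d_model, d_model) for _ in range(num_experts)]
router = torch.nn.Linear(d_model, num_experts)

tokens = torch.randn(10, d_model)                     # 10 tokens
logits = router(tokens)                               # (10, 8) expert scores
weights, indices = torch.topk(logits, top_k, dim=-1)  # best 2 per token
weights = F.softmax(weights, dim=-1)                  # normalize the pair

output = torch.zeros_like(tokens)
for t in range(tokens.shape[0]):
    for slot in range(top_k):
        e = indices[t, slot].item()
        output[t] += weights[t, slot] * experts[e](tokens[t])
```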
The Mixtral-8x22B-Instruct-v0.1 Large Language Model (LLM) is an instruct fine-tuned version of Mixtral-8x22B-v0.1. Real-time inference samples are available as a Python notebook and via the CLI with YAML.
The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters. Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks tested. For full details of this model, please read the paper and release blog post.
The Mixtral-8x22B Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. Mixtral-8x22B-v0.1 is a pretrained base model and therefore does not have any moderation mechanisms. Evaluation results are available on the [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
Mistral NeMo is a cutting-edge Language Model (LLM) with state-of-the-art reasoning, world knowledge, and coding capabilities within its size category.
Mistral Large (2407) is an advanced Large Language Model (LLM) with state-of-the-art reasoning, knowledge and coding capabilities.
The Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of Mistral-7B-v0.2. Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1: a 32k context window (vs. 8k in v0.1), rope-theta = 1e6, and no sliding-window attention. For full details of this model, please read the release blog post.
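A short sketch of what the rope-theta change means: RoPE rotates each pair of head dimensions at a frequency derived from the base theta, so a larger base (1e6 vs. the common 1e4) yields longer wavelengths that stay distinctive across the 32k window. The head size below is an assumption for illustration:

```python
import torch

def rope_inv_freq(theta: float, head_dim: int = 128) -> torch.Tensor:
    # RoPE assigns dimension pair i the rotation frequency theta**(-2i/d).
    i = torch.arange(0, head_dim, 2, dtype=torch.float32)
    return theta ** (-i / head_dim)

print(rope_inv_freq(1e4)[-1])  # slowest frequency at theta = 1e4
print(rope_inv_freq(1e6)[-1])  # much smaller at theta = 1e6 -> longer wavelengths
```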
The Mixtral-8x7B-v0.1 Large Language Model (LLM) is a pretrained generative Sparse Mixture of Experts. Mixtral-8x7B-v0.1 outperforms Llama 2 70B on most benchmarks with 6x faster inference. For full details of this model, please read the [release blog post](https://mistral.ai/news/mixtral-of-experts/).
Mistral Large 24.11 offers enhanced system prompts, advanced reasoning and function-calling capabilities.
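As an illustration of function calling, a hedged sketch using the `mistralai` SDK; the tool schema, the `get_weather` tool, and the model alias are hypothetical assumptions:

```python
import json
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

# Hypothetical tool schema for illustration only.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.complete(
    model="mistral-large-latest",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=tools,
)

# If the model decides to call the tool, its arguments arrive as JSON.
message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    print(call.function.name, json.loads(call.function.arguments))
```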